5 research outputs found

    MMA Training: Direct Input Space Margin Maximization through Adversarial Training

    We study adversarial robustness of neural networks from a margin maximization perspective, where margins are defined as the distances from inputs to a classifier's decision boundary. Our study shows that maximizing margins can be achieved by minimizing the adversarial loss on the decision boundary at the "shortest successful perturbation", demonstrating a close connection between adversarial losses and margins. We propose Max-Margin Adversarial (MMA) training to directly maximize the margins to achieve adversarial robustness. Instead of adversarial training with a fixed ϵ, MMA offers an improvement by adaptively selecting the "correct" ϵ, namely the margin, individually for each datapoint. In addition, we rigorously analyze adversarial training from the perspective of margin maximization, and provide an alternative interpretation of adversarial training as maximizing either a lower or an upper bound of the margins. Our experiments empirically confirm our theory and demonstrate MMA training's efficacy on the MNIST and CIFAR10 datasets w.r.t. ℓ∞ and ℓ2 robustness. Code and models are available at https://github.com/BorealisAI/mma_training.
    Comment: Published at the Eighth International Conference on Learning Representations (ICLR 2020), https://openreview.net/forum?id=HkeryxBtP
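    The core idea, selecting the perturbation budget per example as an estimate of its margin, can be sketched in a few lines. The following is a minimal illustration, not the authors' released implementation (linked above); pgd_attack, margin_eps, and eps_grid are illustrative names, and the grid search is a crude stand-in for the paper's direct search for the shortest successful perturbation.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, steps=10):
    """Standard L_inf PGD: perturb x within an eps-ball to maximize the loss."""
    delta = torch.zeros_like(x, requires_grad=True)
    alpha = 2.5 * eps / steps
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return (x + delta).detach().clamp(0, 1)

def margin_eps(model, x, y, eps_grid):
    """Smallest eps on the grid whose attack flips the prediction: a crude
    per-example margin estimate. Assumes a batch of size 1."""
    for eps in eps_grid:
        x_adv = pgd_attack(model, x, y, eps)
        if model(x_adv).argmax(dim=1).item() != y.item():
            return eps
    return eps_grid[-1]
```

    Training would then minimize the classification loss on adversarial examples generated at this per-example ϵ, so that points far from the decision boundary receive larger budgets and points near it receive smaller ones.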

    Implicit Manifold Learning on Generative Adversarial Networks

    This paper raises an implicit manifold learning perspective in Generative Adversarial Networks (GANs), by studying whether the support of the learned distribution, modelled as a submanifold M_θ, perfectly matches M_r, the support of the real data distribution. We show that optimizing the Jensen-Shannon divergence forces M_θ to perfectly match M_r, while optimizing the Wasserstein distance does not. On the other hand, by comparing the gradients of the Jensen-Shannon divergence and the Wasserstein distances (W_1 and W_2²) in their primal forms, we conjecture that W_2² may enjoy desirable properties such as reduced mode collapse. It is therefore interesting to design new distances that inherit the best of both.
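    The contrast between the two divergences can be checked numerically. The snippet below is an illustrative check, not from the paper; it uses 1-D samples, histogram estimates, and scipy's W_1 rather than W_2², but it shows the relevant behaviour: when two supports barely overlap, the JS divergence saturates while the Wasserstein distance keeps tracking how far apart they are.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(0)
bins = np.linspace(-2.0, 8.0, 201)
for gap in (1.0, 2.0, 4.0):
    a = rng.normal(0.0, 0.1, 10_000)         # mass concentrated near 0
    b = rng.normal(gap, 0.1, 10_000)         # mass concentrated near `gap`
    pa, _ = np.histogram(a, bins=bins, density=True)
    pb, _ = np.histogram(b, bins=bins, density=True)
    js = jensenshannon(pa, pb, base=2) ** 2  # JS divergence, saturates at 1
    w1 = wasserstein_distance(a, b)          # W_1 still reflects the gap
    print(f"gap={gap}: JS={js:.3f}, W1={w1:.3f}")
```

    A saturated divergence gives the generator a vanishing learning signal, which is the usual argument for Wasserstein-type objectives.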

    Dimensionality Reduction has Quantifiable Imperfections: Two Geometric Bounds

    In this paper, we investigate dimensionality reduction (DR) maps in an information retrieval setting from a quantitative topology point of view. In particular, we show that no DR map can achieve perfect precision and perfect recall simultaneously. Thus a continuous DR map must have imperfect precision. We further prove an upper bound on the precision of Lipschitz continuous DR maps. While precision is a natural measure in an information retrieval setting, it does not measure 'how' wrong the retrieved data is. We therefore propose a new measure based on the Wasserstein distance that comes with a similar theoretical guarantee. A key technical step in our proofs is a particular optimization problem of the L_2-Wasserstein distance over a constrained set of distributions. We provide a complete solution to this optimization problem, which can be of independent interest on the technical side.
    Comment: 32nd Conference on Neural Information Processing Systems (NIPS 2018), Montreal, Canada
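    The precision notion at play can be made concrete with a small sketch (not from the paper; knn_sets is an illustrative helper): relevant items are a point's k nearest neighbours in the original space, retrieved items are its k nearest neighbours after the DR map, and precision is their overlap. With equal k on both sides, recall coincides with precision.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

def knn_sets(X, k):
    """Index sets of each point's k nearest neighbors (excluding itself)."""
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    return [set(row[1:]) for row in idx]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))            # high-dimensional data
Z = PCA(n_components=2).fit_transform(X)  # a continuous (indeed Lipschitz) DR map

k = 10
high, low = knn_sets(X, k), knn_sets(Z, k)
precision = np.mean([len(h & l) / k for h, l in zip(high, low)])
print(f"mean kNN precision after PCA: {precision:.3f}")
```

    The paper's bounds say that no choice of continuous map can drive this overlap to 1 on all datasets.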

    Improving GAN Training via Binarized Representation Entropy (BRE) Regularization

    We propose a novel regularizer to improve the training of Generative Adversarial Networks (GANs). The motivation is that when the discriminator D spreads out its model capacity in the right way, the learning signals given to the generator G are more informative and diverse. These in turn help G to explore better and discover the real data manifold while avoiding large unstable jumps due to erroneous extrapolation made by D. Our regularizer guides the rectifier discriminator D to better allocate its model capacity, by encouraging the binary activation patterns on selected internal layers of D to have a high joint entropy. Experimental results on both synthetic data and real datasets demonstrate improvements in stability and convergence speed of GAN training, as well as higher sample quality. The approach also leads to higher classification accuracies in semi-supervised learning.
    Comment: Published as a conference paper at the 6th International Conference on Learning Representations (ICLR 2018)
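    A hedged sketch of what such a penalty can look like (simplified relative to the paper's exact formulation; bre_regularizer, the sharpness constant, and the tanh surrogate for sign are illustrative choices): high joint entropy of the binary patterns is encouraged indirectly, by keeping each unit active on roughly half of the batch and keeping the patterns of different samples dissimilar.

```python
import torch

def bre_regularizer(h, sharpness=5.0):
    """h: (batch, units) pre-activations from a chosen rectifier layer of D.
    Small when soft-binarized activation patterns are balanced per unit
    and decorrelated across samples."""
    s = torch.tanh(sharpness * h)         # smooth surrogate for sign(h)
    me = s.mean(dim=0).abs().mean()       # each unit on ~half the time
    gram = (s @ s.t()) / s.shape[1]       # pattern similarity in [-1, 1]
    off = gram - torch.diag(gram.diag())  # drop self-similarity
    ac = (off ** 2).sum() / (s.shape[0] * (s.shape[0] - 1))
    return me + ac
```

    In use, one would add a weighted bre_regularizer(hidden) term to the discriminator's loss for the selected layers.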

    On the Sensitivity of Adversarial Robustness to Input Data Distributions

    Neural networks are vulnerable to small adversarial perturbations. Existing literature has largely focused on understanding and mitigating the vulnerability of learned models. In this paper, we demonstrate an intriguing phenomenon about the most popular robust training method in the literature, adversarial training: adversarial robustness, unlike clean accuracy, is sensitive to the input data distribution. Even a semantics-preserving transformation of the input data distribution can lead to significantly different robustness for an adversarially trained model that is both trained and evaluated on the new distribution. Our discovery of this sensitivity to the data distribution is based on a study that disentangles the behaviors of clean accuracy and robust accuracy of the Bayes classifier. Empirical investigations further confirm our finding. We construct semantically identical variants of MNIST and CIFAR10, and show that standardly trained models achieve comparable clean accuracies on them, while adversarially trained models achieve significantly different robust accuracies. This counter-intuitive phenomenon indicates that the input data distribution alone can affect the adversarial robustness of trained neural networks, not necessarily the task itself. Lastly, we discuss the practical implications for evaluating adversarial robustness, and make initial attempts to understand this complex phenomenon.
    Comment: ICLR 2019, Seventh International Conference on Learning Representations
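    The experimental recipe can be sketched as follows (illustrative, assuming a trained model and image tensors in [0, 1]; gamma_transform and robust_accuracy are hypothetical helpers, and attack can be any fixed attack such as the PGD routine sketched earlier): apply a label-preserving pixel-wise transformation, train on each variant, and compare robust accuracies.

```python
import torch

def gamma_transform(x, gamma):
    """Pixel-wise x -> x**gamma: invertible and label-preserving on [0, 1]."""
    return x.clamp(0, 1) ** gamma

def robust_accuracy(model, x, y, attack, eps):
    """Accuracy under a fixed attack at budget eps."""
    x_adv = attack(model, x, y, eps)
    return (model(x_adv).argmax(dim=1) == y).float().mean().item()
```

    One would then adversarially train one model on x and another on gamma_transform(x, 0.5), say, and compare their robust accuracies; the paper reports large gaps between such variants despite near-identical clean accuracies.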